Discovering Predictive Association Rules

Authors

  • Nimrod Megiddo
  • Ramakrishnan Srikant
Abstract

Association rule algorithms can produce a very large number of output patterns. This has raised questions of whether the set of discovered rules "overfit" the data because all the patterns that satisfy some constraints are generated (the Bonferroni effect). In other words, the question is whether some of the rules are "false discoveries" that are not statistically significant. We present a novel approach for estimating the number of "false discoveries" at any cutoff level. Empirical evaluation shows that on typical datasets the fraction of rules that may be false discoveries is very small. A bonus of this work is that the statistical significance measures we compute are a good basis for ordering the rules for presentation to users, since they correspond to the statistical "surprise" of the rule. We also show how to compute confidence intervals for the support and confidence of an association rule, enabling the rule to be used predictively on future data.
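To make the last point of the abstract concrete, the following is a minimal sketch of how interval estimates for a rule's support and confidence might be computed. The abstract does not give the authors' procedure, so this uses a standard Wilson score interval for a binomial proportion as a stand-in and assumes that transactions are independent draws. The names `wilson_interval` and `rule_intervals` and the toy transactions are hypothetical and chosen only for illustration.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def rule_intervals(transactions, antecedent, consequent, z=1.96):
    """Point estimates and intervals for the rule antecedent -> consequent.

    Assumption (not from the paper): each transaction is an independent draw,
    so counts of matching transactions are treated as binomial.
    """
    antecedent, consequent = set(antecedent), set(consequent)
    both = antecedent | consequent
    n = len(transactions)
    n_a = sum(1 for t in transactions if antecedent <= set(t))
    n_ab = sum(1 for t in transactions if both <= set(t))
    if n_a == 0:
        raise ValueError("antecedent never occurs; confidence is undefined")
    return {
        # support: fraction of all transactions containing both sides
        "support": n_ab / n,
        "support_ci": wilson_interval(n_ab, n, z),
        # confidence: fraction of antecedent-matching transactions that
        # also contain the consequent
        "confidence": n_ab / n_a,
        "confidence_ci": wilson_interval(n_ab, n_a, z),
    }


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    toy = [{"bread", "milk"}, {"bread"}, {"milk"}, {"bread", "milk", "eggs"}]
    print(rule_intervals(toy, {"bread"}, {"milk"}))
```

With so few transactions the intervals are very wide, which is the practical point: a rule whose lower confidence bound is low may not hold up on future data even if its point estimate looks strong.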


Similar resources

Using Soft-Matching Mined Rules to Improve Information Extraction

By discovering predictive relationships between different pieces of extracted data, data-mining algorithms can be used to improve the accuracy of information extraction. However, textual variation due to typos, abbreviations, and other sources can prevent the productive discovery and utilization of hard-matching rules. Recent methods for inducing soft-matching rules from extracted data can more ...


Extraction of Interesting Association Rules Using Genetic Algorithms

The process of discovering interesting and unexpected rules from large data sets is known as association rule mining. The typical approach is to make strong simplifying assumptions about the form of the rules, and limit the measure of rule quality to simple properties such as support or confidence. Support and confidence limit the level of interestingness of the generated rules. Comprehensibili...


Using Association Rules for Fraud Detection in Web Advertising Networks

Discovering associations between elements occurring in a stream is applicable in numerous applications, including predictive caching and fraud detection. These applications require a new model of association between pairs of elements in streams. We develop an algorithm, Streaming-Rules, to report association rules with tight guarantees on errors, using limited processing per element, and minima...


Apriori Multiple Algorithm for Mining Association Rules

One of the most important data mining problems is mining association rules. In this paper we consider discovering association rules from large transaction databases. The problem of discovering association rules can be decomposed into two sub-problems: finding large itemsets and generating association rules from large itemsets. The second sub-problem is the easier one and the complexity of discovering as...


Cyclic Association Rules

We study the problem of discovering association rules that display regular cyclic variation over time. For example, if we compute association rules over monthly sales data, we may observe seasonal variation where certain rules are true at approximately the same month each year. Similarly, association rules can also display regular hourly, daily, weekly, etc., variation that is cyclical in natur...


On the Discovery of Interesting Patterns in Association Rules

Many decision support systems, which utilize association rules for discovering interesting patterns, require the discovery of association rules that vary over time. Such rules describe complicated temporal patterns such as events that occur on the “first working day of every month.” In this paper, we study the problem of discovering how association rules vary over time. In particular, we introd...




Publication date: 1998